[57] ABSTRACT

An automatic recovery system for a network appliance
features a watchdog processor that monitors operation of the
appliance and initiates reboot as necessary. A primary and a
secondary boot partition are provided in the system, in some
embodiments on the same mass storage device, and in other
embodiments on a different mass storage device. In the event
reboot is unsuccessful from the primary boot partition,
reboot is initiated from the secondary boot partition. The
watchdog processor executes automatic recovery software
stored in a non-volatile storage device, which may be a part
of the same IC as the watchdog processor.

15 Claims, 2 Drawing Sheets 
AUTOMATIC RECOVERY FOR NETWORK APPLIANCES

FIELD OF THE INVENTION

The present invention is in the area of methods and apparatus for
safeguarding network appliances in the event a catastrophic failure
occurs, and it is particularly relevant to auxiliary servers with no
long-term data-storage requirements and with no user interfaces such
as a keyboard, mouse, or monitor.

BACKGROUND OF THE INVENTION

A computer network may have one or more auxiliary servers with no
long-term data-storage facilities and no keyboard or monitor.
Typically, this type of auxiliary server has a small hard drive for
its own operating software, and it may be used for data transfer
functions such as Internet access, electronic mail, fax service, and
remote access. Such a small auxiliary server is commonly known as a
network appliance.

Although network appliances have a lower failure rate than major file
servers, they may be monitored by a software-controlled device
commonly referred to as a watchdog. In case a network appliance
fails, its watchdog will sense the failure and attempt to reboot the
CPU of the network appliance using the operating software stored on a
resident hard disk. If the reboot is successful, the network
appliance will resume operation. However, in the event that a part of
the operating software or the appliance's application software is
corrupted, the watchdog will be unable to restore proper operation of
the network appliance. Consequently, that network appliance remains
unavailable until a person that is responsible for its maintenance
notices the failure and replaces the corrupted software.

What is clearly needed is a method to restore automatically proper
operation of a network appliance that has failed as a result of
corrupted operating or application software that is stored on the
resident hard disk. Such a method eliminates the need for human
intervention and, therefore, will significantly reduce the time that
a network appliance is nonfunctional. This disclosure describes such
an automatic recovery system.

SUMMARY OF THE INVENTION

In a preferred embodiment an automatic recovery system for a network
appliance having a central processing unit (CPU), a mass storage
device, and a non-volatile storage device is provided, comprising a
system-independent watchdog processor coupled to the CPU and to the
storage devices; a primary boot partition on the mass storage device,
comprising primary operating software and primary application
software for execution by the CPU in booting the network appliance
and placing it in operation performing its application; a secondary
boot partition on the mass storage device, comprising secondary
operating software and secondary application software; and an
automatic recovery routine on the non-volatile storage device. The
watchdog processor, executing the automatic recovery routine, in the
event of appliance failure initiates a reboot attempt from the
secondary boot partition.

In some embodiments the primary and the secondary boot partitions are
on the same mass storage device, and in others on separate mass
storage devices. The watchdog processor is independent of the CPU of
the appliance, and the automatic recovery software is stored in a
non-volatile device, so the watchdog processor has access to it and
can execute it even if the appliance is down. In some embodiments the
watchdog processor and the non-volatile storage device may be in the
same IC.

Preferably the secondary operating software and the primary operating
software are identical, but this is not a requirement. In a preferred
embodiment the watchdog processor attempts to reboot the appliance
first from the primary boot partition, and counts reboot attempts.
After a preset number of attempts in a preset time, boot attempts are
switched to the secondary partition.

In other aspects of the invention network appliances are provided
equipped with the recovery system described, and in other aspects,
methods for recovery are provided.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating a network appliance equipped
with an automatic recovery system according to an embodiment of the
present invention.

FIG. 2 is a block diagram illustrating a possible partitioning of a
hard disk of an automatic recovery system for a network appliance
according to an embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 is a block diagram illustrating a network appliance 13
equipped with an automatic recovery system according to an embodiment
of the present invention. Network appliance 13 controls network ports
29 and 30 between a network 23 and client 27, and it may operate
according to any of several network protocols known in the art. There
is no requirement to have two ports 29 and 30, and if there is more
than one they can have the same or all different protocols in any
combination.

Network appliance 13 comprises, but is not limited to, a Central
Processing Unit (CPU) 25, a disk-type storage device (HDD) 41 whose
configuration includes a primary bootable partition and a read-only
secondary bootable partition, a watchdog device 21 that may take the
form of a system-independent microprocessor, a software recovery
routine 19 that resides within the network appliance [...] well as
network interface adapters 29 and 30.

Watchdog device 21 may be an integrated circuit (IC) that is
integrated into any of the other ICs or it may exist as additional
elements in network appliance 13. Software recovery routine 19 may be
stored in the system BIOS ROM, a ROM, a battery backed-up RAM, or any
other nonvolatile storage device.

It is known to the inventor and in the art that in the event of
failure of a network appliance, its watchdog device will sense the
failure and will signal CPU 25, by means of its system bus, an I/O
port, an interrupt, a register, memory, or any other suitable method,
to reboot CPU 25 by executing software programs that are stored on
hard disk 41. However, this method of restoring proper operation of a
network appliance has a serious problem. This problem lies in the
fact that a part of the CPU's operating software or the appliance's
application software, both stored on hard disk 41, can be corrupted,
for example by a glitch in power during update of a file allocation
table, and consequently, watchdog device 21 will be unable to reboot
CPU 25 and restore proper operation of the network appliance.
Consequently, a network
appliance with corrupted software remains unavailable until a human
notices the failure and intervenes.

FIG. 2 is a block diagram illustrating a method to create an
automatic recovery system by means of uniquely partitioning the
recording space of hard disk 41 (FIG. 1) according to an embodiment
of the present invention. As shown in FIG. 2, the recording space of
hard disk 41 (FIG. 1) is partitioned in this embodiment into a data
partition 43, a primary boot partition 55 that retains both operating
software 45 and application software 47, and a secondary boot
partition 57 or shadow partition, with operating software 49 that is
an exact duplicate of operating software 45, and application software
51 that is an exact duplicate of application software 47. Such a
secondary partition 57 could also be stored on a second drive, and
the invention is not limited to storing the duplicate software on the
same drive.

An automatic recovery routine (see item 19 in FIG. 1) according to
this embodiment of the present invention keeps track of the number of
reboots that occur within a certain time period. The number of
performed reboots indicates whether watchdog device 21 is successful
in [...] cycles which most likely are caused by a corrupted primary
[...] operation of network appliance 13. In some embodiments the
application software 51 may be in compressed form, and be
decompressed as needed. In other embodiments, the recovery system may
also reinstall portions or all of the code in the first boot
partition. Due to this automatic recovery system, a total
catastrophic failure of a network appliance can be avoided and
therefore will cause only a relatively brief period of disruption of
service to clients.

It will be apparent to those with skill in the art that there are
many possible variations for the storage of an automatic recovery
routine and secondary operation and application software. For
example, a second hard disk may be used to store bootable operation
software as well as application software. In addition, the watchdog
may be configured to automatically send error messages to a remote
monitoring station on the network when an appliance fails.

It will be apparent to those with skill in the art that there will be
many alterations that might be made in the embodiments of the
invention described herein without departing from the spirit and
scope of the invention. Some of the variations have already been
mentioned. Others include the fact that the primary and secondary
operating software and application software do not necessarily have
to be exact copies, but do have to be capable of performing the
requisite functions of operation and application.

What is claimed is:

1. An automatic recovery system for a network appliance having a
central processing unit (CPU), a mass storage device having a first
boot partition comprising an operating system and application
software, and a non-volatile storage device, comprising:

  a system-independent watchdog processor coupled to the CPU and to
  the storage devices;

  a second boot partition on the mass storage device, comprising a
  copy of the operating software and a copy of the application
  software; and

  an automatic recovery routine on the non-volatile storage device;

  wherein the watchdog processor, executing the automatic recovery
  routine, in the event of appliance failure, initiates reboot from
  the second boot partition thereby placing the network appliance
  back in service without human intervention.

2. An automatic recovery system as in claim 1 further comprising a
second mass storage device, and wherein the first and second boot
partitions reside respectively on the first and second mass storage
devices.

3. An automatic recovery system as in claim 1 wherein the watchdog
processor and the non-volatile storage device comprise a single
integrated circuit (IC).

4. An automatic recovery system as in claim 1 wherein the watchdog
processor, executing the automatic recovery routine, monitors reboot
attempts from the first partition, and initiates reboot from the
second partition as a consequence of the reboot attempts from the
first partition exceeding a preprogrammed number of attempts in a
preprogrammed time period.

5. The system of claim 1 wherein, after reboot from the second boot
partition, automatically placing the appliance [...]

6. A network appliance [...] a central processing unit (CPU), a mass
storage device having a [...]

  a system-independent watchdog processor coupled to the CPU and to
  the storage devices;

  a second boot partition on the mass storage device, comprising a
  copy of the operating software and a copy of the application
  software; and

  an automatic recovery routine on the non-volatile storage device;

  wherein the watchdog processor, executing the automatic recovery
  routine, in the event of appliance failure initiates a reboot
  attempt from the second boot partition thereby placing the network
  appliance back in service without human intervention.

7. A network appliance as in claim 6 further comprising a second mass
storage device, and wherein the first and second boot partitions
reside on separate mass storage devices.

8. A network appliance as in claim 6 wherein the watchdog processor
and the non-volatile storage device comprise a single integrated
circuit (IC).

9. A network appliance as in claim 6 wherein the watchdog processor,
executing the automatic recovery routine, monitors reboot attempts
from the first partition, and initiates reboot from the second
partition as a consequence of the reboot attempts from the first
partition exceeding a preprogrammed number of attempts in a
preprogrammed time period.

10. The network appliance of claim 6 wherein, after reboot from the
second boot partition, automatically placing the appliance back in
service, the automatic recovery system copies the operating system
and the application software from the second boot partition into the
first boot partition, thereby repairing the first boot partition.

11. A method for rebooting a network appliance having a CPU, a mass
storage device having a first and a second boot partition each
comprising an identical operating system and application software,
and a non-volatile storage device, comprising steps of:
(a) monitoring operation of the appliance for failure
requiring reboot via an on-board watchdog processor
executing, independently of the CPU, automatic recovery
software stored on the non-volatile storage device;

(b) initiating reboot from the first boot partition on the
mass storage device in the event of failure requiring
reboot; and

(c) initiating reboot from the second boot partition on the
mass storage device in the event that reboot from the
first boot partition is unsuccessful.

12. The method of claim 11 wherein the second boot
partition is on a second mass storage device, and, in step (c),
the reboot from the second boot partition is from the second
mass storage device.

13. The method of claim 11 wherein the watchdog processor
and the non-volatile storage device comprise a single
integrated circuit (IC).

14. The method of claim 11 wherein, in step (c), the
watchdog processor, executing the automatic recovery
routine, monitors reboot attempts from the first boot
partition, and initiates reboot from the second boot partition
as a consequence of the reboot attempts from the first boot
partition exceeding a preprogrammed number of attempts in
a preprogrammed time period.

15. The method of claim 11 further comprising a step (d)
copying the operating system and application software from
the second boot partition into the first boot partition.
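
The method steps recited above (monitor for failure, reboot from the first boot partition, switch to the second boot partition once reboot attempts exceed a preprogrammed number within a preprogrammed time period, then copy the second partition back over the first) can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the class name `RecoveryRoutine`, the hooks `try_boot` and `repair_primary`, and the default values of 3 attempts and 300 seconds are all assumptions introduced for the example, and Python stands in for whatever firmware language a watchdog processor would actually run.

```python
from dataclasses import dataclass, field
from typing import Callable, List

PRIMARY = "primary"      # first boot partition
SECONDARY = "secondary"  # second (shadow) boot partition


@dataclass
class RecoveryRoutine:
    """Hypothetical model of the recovery routine executed by the watchdog."""

    # Hypothetical hook: try to boot from a partition, return True on success.
    try_boot: Callable[[str], bool]
    # Hypothetical hook: copy the second boot partition over the first.
    repair_primary: Callable[[], None]
    max_attempts: int = 3     # assumed "preprogrammed number of attempts"
    window_s: float = 300.0   # assumed "preprogrammed time period", seconds
    attempts: List[float] = field(default_factory=list)

    def on_failure(self, now: float) -> str:
        """Handle one detected failure at time `now` (seconds).

        Returns the partition that booted successfully, or "none" if the
        boot attempt failed and the watchdog must detect failure again.
        """
        # Forget primary-boot attempts that fall outside the time window.
        self.attempts = [t for t in self.attempts if now - t < self.window_s]

        if len(self.attempts) < self.max_attempts:
            # Still within the allowed count: reboot from the first partition.
            self.attempts.append(now)
            return PRIMARY if self.try_boot(PRIMARY) else "none"

        # One more primary attempt would exceed the limit: use the second
        # partition, and on success repair the first partition from it.
        if self.try_boot(SECONDARY):
            self.repair_primary()
            self.attempts.clear()
            return SECONDARY
        return "none"
```

In this sketch, a caller would invoke `on_failure` each time the watchdog senses that the appliance has stopped responding, passing a monotonic timestamp. Failures spaced further apart than `window_s` never trigger the switch to the second partition, which matches the idea that only rapid reboot cycles point to corrupted software on the first partition.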